Mon. Not. R. Astron. Soc. 000, 000-000 (0000) Printed 1 February 2008 (MN LaTeX style file v1.4)

Comparison of the Large Scale Clustering in the APM and 
the EDSGC Galaxy Surveys 

Istvan Szapudi¹, E. Gaztañaga²

1. University of Durham, Department of Physics, South Road, Durham DH1 3LE, United Kingdom

2. Institut d'Estudis Espacials de Catalunya, Research Unit (CSIC), Edf. Nexus-104 - c/ Gran Capitan 2-4, 08034 Barcelona 






ABSTRACT 

Clustering statistics are compared in the Automatic Plate Machine (APM) and the 
Edinburgh/Durham Southern Galaxy Catalogue (EDSGC) angular galaxy surveys. 
Both surveys were independently constructed from scans of the same adjacent UK 
IIIa-J Schmidt photographic plates with the APM and COSMOS microdensitometers,
respectively. The comparison of these catalogs is a rare practical opportunity to study 
systematic errors, which cannot be achieved via simulations or theoretical methods. 
On intermediate scales, $0.1^\circ < \theta < 0.5^\circ$, we find good agreement for the cumulants
or reduced moments of counts in cells up to sixth order. On larger scales there is a 
small disagreement due to edge effects in the EDSGC, which covers a smaller area. 
On smaller scales, we find a significant disagreement that can only be attributed to 
differences in the construction of the surveys, most likely the dissimilar deblending of 
crowded fields. The overall agreement of the APM and EDSGC is encouraging, and 
shows that the results for intermediate scales should be fairly robust. On the other 
hand, the systematic deviations found at small scales are significant in a regime where
comparison with theory and simulations is possible. This is an important fact to bear 
in mind when planning the construction of future digitized galaxy catalogs. 

1 INTRODUCTION

Clustering measurements from galaxy catalogues have become an important tool
to test models of structure formation. Large, sophisticated data sets are
currently under analysis or construction. To interpret high precision
measurements of clustering, a detailed understanding of the uncertainties is
required. Errors can arise from the finite size and geometry of the catalog,
such as discreteness, edge, and finite volume effects ("cosmic errors"), from
the insufficient sampling [...] hereafter SC96) and N-body simulations yield
reasonable estimates. Systematic errors are even more difficult to
investigate, and a unique opportunity is provided when the same raw data are
reduced independently by two research teams. The goal of this Letter is to
seize on such an opportunity: the APM and the EDSGC galaxy surveys were
constructed independently from the same underlying photographic plates. In
particular, we investigate the degree of reproducibility of the higher order
clustering measurements, i.e. to what extent different choices during the
construction of a galaxy catalog can lead to different estimates of
clustering.

The most widespread tools to study clustering in a galaxy catalog are the
two-point correlation function, $\xi_2$, and the amplitudes of the higher
order correlation functions. The latter are usually expressed in the form of
hierarchical ratios, $S_J = \xi_J/\xi_2^{J-1}$, where $\xi_J$ is the $J$-th
order correlation function or reduced cumulant. The predictions for the
$S_J$'s in [...] measure and interpret than the two-point function; however,
at low orders, they are less affected by intrinsic observational
uncertainties, like time evolution or projection effects.

In section §2 we summarize the properties of the two catalogues; the method
of analysis and the actual comparison follow in sections §3 and §4. §5
discusses the implications of the results.



2 THE APM AND EDINBURGH/DURHAM 
SOUTHERN GALAXY CATALOGUES 

The APM Galaxy Survey covers 4300 square degrees on the sky and contains over
2 million galaxies to a limiting apparent magnitude of $b_J < 20.5$ (Maddox
et al. 1990a; Maddox et al. 1990b; Maddox et al. 1990c; Maddox et al. 1996).
It was constructed from APM (a microdensitometer) scans of 188 adjacent UK
IIIa-J Schmidt photographic plates and reaches a limiting magnitude of
$b_J = 20.5$. In an extensive analysis of the systematic errors involved in
plate matching, Maddox et al. (1996) have placed an upper limit of
$\delta w(\theta) \sim 1 \times 10^{-3}$ on the likely contribution of the
systematic errors to the angular correlations. The shape of the angular
correlation function measured from the survey at scales of
$\theta > 1^\circ$ indicates that the universe contains more structure on
large scales than is predicted by the standard Cold Dark Matter scenario
(Maddox et al. 1990c). The higher order correlations in the APM were measured
by Gaztañaga (1994, hereafter G94), Szapudi et al. (1995, hereafter SDES),
and Szapudi & Szalay (1997a).



The EDSGC is a catalogue of 1.5 million galaxies covering ~ 1000 square
degrees centered on the South Galactic Pole. The database was constructed
from COSMOS (a microdensitometer) scans of 60 adjacent UK IIIa-J Schmidt
photographic plates (a subset of the APM plates) and also reaches a limiting
magnitude of $b_J = 20.5$.

The entire catalogue has < 10% stellar contamination and is $\gtrsim 95\%$
complete for galaxies brighter than $b_J = 19.5$ (Heydon-Dumbleton et al.
1989). The two-point galaxy angular correlation function measured from the
EDSGC has been presented by Collins, Nichol, & Lumsden (1992) and Nichol &
Collins (1994). The higher order correlations in the EDSGC were measured by
Szapudi, Meiksin, & Nichol (1996, hereafter SMN96).

We emphasize that the raw data for both catalogs comprise the same UK IIIa-J
Schmidt plates (a smaller subset in the case of the EDSGC), while the
hardware used to digitize the plates and the software used to detect and
classify objects and to measure their apparent magnitudes were different. In
particular, different calibration, plate-matching, and deblending algorithms
were employed. As a consequence, there is a small offset in the magnitude
scales of the two catalogues (Nichol 1992), even though a simple one-to-one
mapping can be established.

Magnitude cuts for the comparison of the statistics were determined by
practical considerations. For the APM we follow G94 and use
$m_{APM} = 17 - 20$, which is half a magnitude brighter than the completeness
limit. For the EDSGC catalogue, which is complete to about $m_{EDS} = 20.3$
magnitude, we follow SMN96 and use a magnitude cut of
$16.98 < m_{EDS} < 19.8$, which is again half a magnitude brighter than the
completeness limit. Based on matching the surface densities listed in SDES,
these magnitude ranges approximately correspond to each other. This
facilitates the direct cross-comparison of the results.



3 THE METHOD OF ANALYSIS 

The calculation of the higher order correlation functions closely followed
the method outlined in SMN96. It consists of estimating the probability
distribution of counts in cells, calculating the factorial moments, and
extracting the normalized, averaged amplitudes of the J-point correlation
functions. For the most crucial first step the infinite oversampling
algorithm of Szapudi (1997) was used. Only a few of the most important
definitions are presented below.
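As a rough illustration of the first step, the sketch below estimates the count probability distribution $P_N$ from a point set. It is not the infinite oversampling algorithm of Szapudi (1997): it simply tiles a rectangle with non-overlapping square cells, which corresponds to a low sampling estimate. The function names (`counts_in_cells`, `count_probability`) and the uniform random test points are assumptions made purely for illustration.

```python
import numpy as np

def counts_in_cells(x, y, cell_size, extent):
    """Count points in a grid of non-overlapping square cells.

    x, y      : point coordinates (same units as cell_size)
    extent    : (xmin, xmax, ymin, ymax) of the surveyed rectangle
    Returns a flat integer array with one count N per cell.
    """
    xmin, xmax, ymin, ymax = extent
    nx = int((xmax - xmin) // cell_size)
    ny = int((ymax - ymin) // cell_size)
    # Use only the part of the rectangle covered by whole cells.
    x_edges = xmin + cell_size * np.arange(nx + 1)
    y_edges = ymin + cell_size * np.arange(ny + 1)
    counts, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges])
    return counts.astype(int).ravel()

def count_probability(counts):
    """Estimate P_N, the probability of finding N objects in a cell."""
    hist = np.bincount(counts)
    return hist / hist.sum()

# Hypothetical, unclustered test data: 5000 uniform points in a 10 x 10 box.
rng = np.random.default_rng(0)
x, y = rng.uniform(0.0, 10.0, size=(2, 5000))
counts = counts_in_cells(x, y, cell_size=0.5, extent=(0.0, 10.0, 0.0, 10.0))
p_n = count_probability(counts)
```

With only one cell per tile, the number of cells is small at large scales, which is the kind of measurement error that oversampling is meant to remove.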

The average of the J-point angular correlation functions on a scale $l$ is
defined by

$$\bar{\omega}_J(l) = A(l)^{-J} \int d^2r_1 \ldots d^2r_J \, \omega_J(r_1, \ldots, r_J), \qquad (1)$$

where $\omega_J$ is the J-point correlation function in the two dimensional
survey, and $A(l)$ is the area of a square cell of size $l$. The hierarchical
ratios, $s_J$, are defined in the usual way,

$$s_J = \frac{\bar{\omega}_J}{\bar{\omega}_2^{J-1}}. \qquad (2)$$



The raw counts in cells measurements are reduced to a set consisting of
$\bar{n}$, $\bar{\omega}_2$, $s_J$, which forms a suitable basis for
subsequent comparison of the statistics; $\bar{n}$ denotes the average count
in a cell.
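The reduction from a count probability distribution to the set of average count, variance and hierarchical ratios can be sketched as follows. This is a minimal illustration that assumes the standard relation in which the normalized factorial moments $F_k/\bar{n}^k$ are the raw moments of the underlying density field, and converts them to cumulants with the usual moment-to-cumulant recursion. The function names are hypothetical and the code is not taken from SMN96.

```python
import numpy as np
from math import comb

def factorial_moments(p_n, max_order):
    """F_k = sum_N P_N N(N-1)...(N-k+1) for k = 1..max_order."""
    n = np.arange(len(p_n), dtype=float)
    falling = np.ones_like(n)
    moments = []
    for k in range(1, max_order + 1):
        falling = falling * (n - (k - 1))  # builds N(N-1)...(N-k+1)
        moments.append(float(np.sum(p_n * falling)))
    return moments

def hierarchical_ratios(p_n, max_order=4):
    """Return (nbar, w2, {J: s_J}) from a count probability distribution."""
    f = factorial_moments(p_n, max_order)
    nbar = f[0]
    # Normalized factorial moments: raw moments of (1 + delta), with m[0] = 1.
    m = [1.0] + [f[k - 1] / nbar**k for k in range(1, max_order + 1)]
    # Moment-to-cumulant recursion; kappa[J] is the averaged J-point amplitude.
    kappa = [0.0, m[1]]
    for j in range(2, max_order + 1):
        lower = sum(comb(j - 1, k - 1) * kappa[k] * m[j - k] for k in range(1, j))
        kappa.append(m[j] - lower)
    w2 = kappa[2]
    s = {j: kappa[j] / w2 ** (j - 1) for j in range(3, max_order + 1)}
    return nbar, w2, s
```

Because factorial moments are used, the Poisson discreteness contribution drops out automatically; for a purely Poisson $P_N$ the variance `w2` vanishes and the ratios are undefined.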

Counts in cells were measured in square cells with sizes in the range
$0.015125^\circ - 2^\circ$ (corresponding to $0.1 - 14\,h^{-1}$ Mpc with
$D \sim 400\,h^{-1}$ Mpc, the approximate depth of the catalogues). Practical
considerations determined this scale range: the upper scale was chosen to
minimize the edge effects from cut-out holes, while the smallest scale
approaches that of galaxy halos for the typical depth of the catalogs. For
details see SMN96. Note that physical coordinates were used in both surveys
to eliminate the effects of distortion.
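The quoted correspondence between angular and physical scales can be checked with a small-angle estimate, $R \simeq D\,\theta$. The sketch below assumes this simple Euclidean approximation, which may not be the exact conversion used for the catalogues; `transverse_scale` is a hypothetical helper name.

```python
import math

def transverse_scale(theta_deg, depth):
    """Small-angle transverse size R = D * theta, with theta in radians.

    depth is in h^-1 Mpc, so the result is in h^-1 Mpc as well.
    """
    return depth * math.radians(theta_deg)

# Smallest and largest cell sizes quoted in the text, at D ~ 400 h^-1 Mpc.
r_min = transverse_scale(0.015125, 400.0)
r_max = transverse_scale(2.0, 400.0)
print(f"{r_min:.2f} to {r_max:.1f} h^-1 Mpc")
```

Under this approximation the two ends of the range come out close to 0.1 and 14 $h^{-1}$ Mpc, in line with the values given above.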



4 COMPARISON 

The amplitudes of the measured J-point correlation functions for
$2 \le J \le 6$ are displayed in a series of figures. To facilitate
comparison with perturbation theory, angular scales in all graphs were
converted to an equivalent circular cell size, $\theta$, i.e.
$\pi\theta^2 = l^2$. Note that square cells were used for the measurements,
up to a small deformation due to projection. This has a negligible effect
through slightly differing form factors, which cancels out anyway when
comparing the results from the two catalogs with each other. The cell size in
the APM pixel maps is defined by dividing the full APM area by the number of
cells. The corresponding scale is about 5% smaller than previously used in
G94 and SDES, where the cell size was defined as the mean equal area
projection size.
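For reference, the equal-area conversion between a square cell of side $l$ and the radius $\theta$ of the equivalent circular cell, $\pi\theta^2 = l^2$, can be written as a pair of helper functions. The names are hypothetical and the snippet only restates the relation given in the text.

```python
import math

def equivalent_radius(side):
    """Radius of the circle with the same area as a square of given side."""
    return side / math.sqrt(math.pi)

def equivalent_side(radius):
    """Side of the square with the same area as a circle of given radius."""
    return radius * math.sqrt(math.pi)

# A 2 degree square cell maps to a circular cell of radius ~1.13 degrees.
theta_max = equivalent_radius(2.0)
```

The conversion is a fixed factor of $1/\sqrt{\pi} \approx 0.564$, so it shifts all curves by the same amount along the scale axis.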

The mean density of the EDSGC counts is about 10% smaller than that of the
APM (see also SMN97). This is partially due to star mergers, which account
for 5% of the APM images in the $b_J = 17 - 20$ slice (Maddox et al. 1990).
The remaining 5% can be attributed to a small difference in the depths due to
a slight offset in the magnitude slices.

Figure 1 shows the variance of counts-in-cells as a function of the cell
radius in degrees. The full squares linked by the solid line correspond to
the measurement in the EDSGC catalogue. The small differences in the mean
depth mentioned above should produce an upward shift of about 10% in the
EDSGC correlation amplitude, which is confirmed by the Figure. The open
squares display the measurements by G94 for the full APM catalogue, while the
short-dashed line is the recalculation of the same with infinite sampling.
The long-dashed line is the measurement of a subregion of the APM which
overlaps with the EDSGC (EDSGC ∩ APM). The latter agrees well within the
errors with the full APM measurements and is slightly lower than the
corresponding $\bar{\omega}_2$ in the EDSGC catalogue, roughly as expected
from the mentioned differences. There is an overall agreement between all
estimates, at least on large scales. On smaller scales the APM appears to
produce slightly lower values; this is probably related to the larger
discrepancy of the hierarchical amplitudes which will be discussed next.

Figure 1. The solid squares joined by continuous lines show our measurement
of the angular $\bar{\omega}_2$ over the EDSGC survey area with infinite
sampling. Open squares with errorbars display the mean and variance of the
$\bar{\omega}_2$ measurements in four equal parts of the APM survey,
estimated with low sampling. The short-dashed line corresponds to similar APM
measurements with infinite sampling. The long-dashed line shows the APM
results restricted to the EDSGC region measured with infinite sampling.

Figures 2 and 3 compare the skewness, $s_3$, and the higher order $s_J$'s,
$J = 4, 5, 6$. The following discussion is equally applicable to all orders;
the separate graph for $J = 3$ shows more details. Contrary to what happened
for $\bar{\omega}_2$, a small difference in the depth should not change the
hierarchical ratios, as the depth cancels out in the normalization (see Groth
& Peebles 1977). The Figures follow this expectation. For scales of about
0.2° to 2° the agreement is good between the full EDSGC and the same region
of the APM (EDSGC ∩ APM region). The increase of the $s_J$'s at the largest
scales ($\theta > 0.5^\circ$) in the EDSGC ∩ APM region is due to edge and
finite volume effects: a similar trend appears in the same region for both
catalogues. On these large scales, the full APM measurements are more
accurate since its larger area decreases cosmic errors. Note that for the
measurement represented with the short dashes the edges of the catalog were
cut out generously to eliminate any possible inhomogeneity. In addition, the
masks were fully excluded, while the original measurement followed a somewhat
different procedure (see G94 for details). This could account for the slight
difference at the largest scales.

Figure 2. Same as in Figure 1 for the hierarchical skewness
$s_3 = \bar{\omega}_3/\bar{\omega}_2^2$. The misalignment of the open and
solid squares at scales $\gtrsim 0.5$ degrees is the result of edge effects,
as both correspond to smaller surveys.

The $s_J$'s measured in the EDSGC ∩ APM region of the APM are compatible with
the errors of the full APM measurements at most scales. At scales larger than
0.5°, edge effects start to dominate the errors of the smaller sample. For
$0.1^\circ < \theta < 0.5^\circ$ the EDSGC ∩ APM region appears to produce
slightly lower hierarchical ratios than the full APM. These values in some
cases are outside of the formal errorbars. The reason for this is that
dividing the sample into subsamples gives only an approximate estimate of the
errors, and it can lead to underestimation as the subsamples are not fully
independent. Moreover, for a non-Gaussian error distribution values outside
the formal errorbar are less unlikely (SC96).
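The subsample error estimate discussed above can be sketched as the mean and scatter of a statistic measured separately in a few sub-regions. The sketch assumes that the error bar is the sample standard deviation over the sub-regions; whether the original analysis used this or the error on the mean is not stated here. The function name and the input numbers are purely illustrative, not measurements from either catalogue.

```python
import numpy as np

def subsample_errors(measurements):
    """Mean and scatter of a statistic over sub-regions.

    measurements : array-like of shape (n_subsamples, n_scales)
    Returns (mean, scatter), each of length n_scales. The scatter is the
    sample standard deviation (ddof=1) across sub-regions.
    """
    m = np.asarray(measurements, dtype=float)
    return m.mean(axis=0), m.std(axis=0, ddof=1)

# Made-up s_3 values in four sub-regions at two scales (illustration only).
toy = [[3.0, 4.0],
       [5.0, 4.0],
       [3.0, 4.0],
       [5.0, 4.0]]
mean, scatter = subsample_errors(toy)
```

If the sub-regions are correlated, as the text notes, the scatter computed this way tends to underestimate the true uncertainty.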



At the smaller scales there is a statistically significant difference between
the APM and the EDSGC. This is not due to finite volume effects, since it
persists when only the same region of the sky is used. The identical geometry
with the same magnitude cut excludes edge and discreteness effects as well,
and thus all cosmic errors. The difference is not due to the method of
estimation either, since the original low sampling measurement by G94 gives
similar results to the recalculation with infinite oversampling, which fully
eliminates measurement errors (SC96; Szapudi 1997). The only remaining
possibility is that the results should be attributed to systematics.

Figure 3. Same as in Figure 1 for $s_4$, $s_5$ and $s_6$.



5 DISCUSSION

According to SMN97, insufficient sampling can cause severe underestimation of
the higher order $s_J$'s. This could be a possible cause for the disagreement
between the EDSGC and the APM on scales smaller than 0.2 degrees, since the
original APM measurements by G94 were performed on a density pixel map with
resolution given by the smallest scale shown in the figures. However, the
infinite sampling $s_J$'s are in good agreement with the original analysis by
G94. Although, as expected, the infinite oversampling results at small scales
seem slightly higher than the corresponding low sampling ones, the Figures
show that this effect is not significant, and it can be discounted as the
main reason for the disagreement between the APM and the EDSGC. The
discrepancies on small scales are therefore due to intrinsic differences in
the catalogues. Since both catalogs use the same raw photographic plates, the
difference discovered with the same statistical methods must lie with the
different choices of hardware and software during the scans and/or the data
reduction. The dissimilarity in the deblending algorithms is a particularly
good candidate to account for the detected statistical difference (G.
Efstathiou, private communication). However, this point needs further
investigation.

Previous results and their interpretations on large scales seem unaffected by
the detected discrepancies. In particular, both the APM and the EDSGC higher
order correlations are in general agreement with perturbation theory (G94,
GF94, BGE95, SMN97). In summary, the results qualitatively support scenarios
with gravitational instability arising from Gaussian initial conditions, with
little or no biasing. Note that the EDSGC barely probes quasi-linear scales
($R > 8\,h^{-1}$ Mpc or $\theta > 1^\circ$), thus extended perturbation
theory and results from N-body simulations have to be invoked as a
theoretical basis for comparison at smaller scales. There is a hint that, at
least qualitatively, the EDSGC results at the smallest scales follow N-body
simulations more closely, while the drop experienced in the APM reduced
moments at the same scales is unexpected, and could be an artificial effect.
The new generation of CCD based redshift and angular surveys, such as the
SDSS and 2dF, should be able to clarify this situation and put tighter
constraints on biasing models.